
    Highly over-parameterized classifiers generalize since bad solutions are rare

    We study the generalization of over-parameterized classifiers for which Empirical Risk Minimization (ERM) leads to zero training error. In these over-parameterized settings there are many global minima with zero training error, some of which generalize better than others. We show that under certain conditions the fraction of "bad" global minima, those with a true error larger than ε, decays to zero exponentially fast with the number of training examples n. The bound depends on the distribution of the true error over the set of classifier functions used for the given classification problem, and does not necessarily depend on the size or complexity (e.g. the number of parameters) of that set. This may explain the unexpectedly good generalization of even highly over-parameterized neural networks. We support our mathematical framework with experiments on a synthetic data set and a subset of MNIST.
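
    As a rough illustration of the abstract's core argument (not code from the paper): a classifier with true error e interpolates n i.i.d. training examples with probability (1 − e)^n, so among interpolating classifiers the fraction with true error above ε shrinks exponentially in n. The Monte Carlo sketch below assumes a toy model in which per-example errors are independent and true errors follow an arbitrary Beta(1, 3) distribution over the function class; both are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

eps = 0.1              # threshold defining a "bad" classifier (true error > eps)
n_classifiers = 200_000

# Toy model (assumption): each candidate classifier misclassifies any example
# independently with probability equal to its true error, and true errors are
# drawn from a fixed distribution over the function class. The paper's bound
# depends on exactly this distribution; Beta(1, 3) is an arbitrary choice here.
true_errors = rng.beta(1.0, 3.0, size=n_classifiers)

for n in (5, 10, 20, 40, 80):
    # A classifier with true error e reaches zero training error on n
    # independent examples with probability (1 - e)^n.
    interpolates = rng.random(n_classifiers) < (1.0 - true_errors) ** n
    bad = true_errors > eps
    # Fraction of zero-training-error classifiers whose true error exceeds eps.
    frac_bad = (interpolates & bad).mean() / max(interpolates.mean(), 1e-12)
    print(f"n={n:3d}  fraction of interpolating classifiers that are bad: {frac_bad:.4f}")
```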

    Deep Convolutional Neural Networks as Generic Feature Extractors

    Recognizing objects in natural images is an intricate problem involving multiple conflicting objectives. Deep convolutional neural networks, trained on large datasets, achieve convincing results and are currently the state-of-the-art approach for this task. However, the long time needed to train such deep networks is a major drawback. We tackled this problem by reusing a previously trained network. For this purpose, we first trained a deep convolutional network on the ILSVRC2012 dataset. We then kept the learned convolution kernels fixed and retrained only the classification part on different datasets. Using this approach, we achieved an accuracy of 67.68% on CIFAR-100, compared to the previous state-of-the-art result of 65.43%. Furthermore, our findings indicate that convolutional networks are able to learn generic feature extractors that can be used for different tasks.
    Comment: 4 pages, accepted version for publication in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), July 2015, Killarney, Ireland
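
    The recipe the abstract describes, freezing the convolutional feature extractor and retraining only the classification part, can be sketched with modern tooling. This is a hypothetical reconstruction, not the authors' code: torchvision's pretrained AlexNet stands in for their ILSVRC2012-trained network, and the optimizer settings are placeholders.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Pretrained AlexNet as a stand-in for the paper's own ILSVRC2012-trained
# network (an assumption; the original work predates this library).
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Keep the learned convolution kernels fixed ...
for p in model.features.parameters():
    p.requires_grad = False

# ... and retrain only the classification part, here resized to the
# 100 classes of CIFAR-100.
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 100)

# Optimize only the trainable (classifier) parameters; hyperparameters
# are illustrative, not taken from the paper.
optimizer = optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-3, momentum=0.9,
)
```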

    Genetic Algorithms in Time-Dependent Environments

    The influence of time-dependent fitness functions on the infinite-population dynamics of simple genetic algorithms (without crossover) is analyzed. Based on general arguments, a schematic phase diagram is constructed that characterizes the asymptotic states as a function of the mutation rate and the time scale of the changes. Furthermore, the notion of regular changes is introduced, for which the population can be shown to converge towards a generalized quasispecies. On this basis, error thresholds and an optimal mutation rate are calculated approximately for a generational genetic algorithm with a moving needle-in-the-haystack landscape. The resulting phase diagram is fully consistent with our general considerations.
    Comment: 24 pages, 14 figures, submitted to the 2nd EvoNet Summer School
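
    A minimal sketch of the setting the abstract studies, as an illustration rather than the paper's model: a generational genetic algorithm with fitness-proportional selection, independent per-bit mutation, no crossover, and a needle-in-the-haystack fitness whose needle moves at regular intervals. All parameter values (genome length, population size, mutation rate, change period) are illustrative assumptions; the paper analyzes the infinite-population limit analytically.

```python
import numpy as np

rng = np.random.default_rng(0)

L, pop_size, mu = 16, 500, 0.02   # genome length, population size, mutation rate
needle = rng.integers(0, 2, L)    # target bit string of the needle-in-the-haystack

def fitness(pop, needle):
    # Only the exact needle string is fit; everything else is the flat haystack.
    return np.where((pop == needle).all(axis=1), 10.0, 1.0)

pop = rng.integers(0, 2, (pop_size, L))
for gen in range(200):
    if gen > 0 and gen % 50 == 0:
        needle[rng.integers(L)] ^= 1   # regular change: the needle moves by one bit
    f = fitness(pop, needle)
    # Fitness-proportional selection without crossover, as in the paper.
    parents = pop[rng.choice(pop_size, size=pop_size, p=f / f.sum())]
    # Independent per-bit mutation with rate mu.
    pop = parents ^ (rng.random((pop_size, L)) < mu)
```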